
Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration


Abstract

State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution. Such networks strain the computational capabilities and energy available to embedded and mobile processing platforms, restricting their use in many important applications. In this paper, we push the boundaries of hardware-effective CNN design by proposing BCNN with Separable Filters (BCNNw/SF), which applies Singular Value Decomposition (SVD) on BCNN kernels to further reduce computational and storage complexity. To enable its implementation, we provide a closed form of the gradient over SVD to calculate the exact gradient with respect to every binarized weight in backward propagation. We verify BCNNw/SF on the MNIST, CIFAR-10, and SVHN datasets, and implement an accelerator for CIFAR-10 on FPGA hardware. Our BCNNw/SF accelerator realizes memory savings of 17% and execution time reduction of 31.3% compared to BCNN with only minor accuracy sacrifices.
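
The core idea described in the abstract is to factor each convolution kernel into separable (rank-1) filters via SVD while keeping weights binarized, shrinking both storage and arithmetic per kernel. The NumPy sketch below illustrates that separable-plus-binary idea only; the function name, the ordering of SVD and binarization, and the scaling are illustrative assumptions rather than the paper's exact BCNNw/SF formulation (which also derives a closed-form gradient through the SVD for training).

```python
import numpy as np

def separable_binarized_kernel(W):
    """Rank-1 (separable) approximation of a k x k kernel via SVD,
    with the two 1-D factors binarized to {-1, +1}.

    Illustrative sketch only: the name and the exact ordering of SVD
    and binarization are assumptions, not the paper's precise scheme."""
    # SVD: W = U @ diag(S) @ Vt; keep only the largest singular value.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    u = U[:, 0] * np.sqrt(S[0])   # vertical (column) filter
    v = Vt[0, :] * np.sqrt(S[0])  # horizontal (row) filter
    # Binarize each factor (sign, with ties broken toward +1).
    u_b = np.where(u >= 0, 1.0, -1.0)
    v_b = np.where(v >= 0, 1.0, -1.0)
    # The separable kernel is the outer product of the two 1-D filters,
    # so per-kernel storage drops from k*k binary weights to 2*k.
    return np.outer(u_b, v_b), u_b, v_b

if __name__ == "__main__":
    W = np.random.randn(3, 3)   # a toy 3x3 convolution kernel
    W_sep, u_b, v_b = separable_binarized_kernel(W)
    print(u_b, v_b, W_sep, sep="\n")
```

In this toy form, a k x k 2-D convolution can be replaced by a k-tap vertical pass followed by a k-tap horizontal pass, which is the source of the compute and memory savings the accelerator exploits.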
